Hyperion: High Volume Stream Archival for Retrospective Querying
Abstract
Network monitoring systems that support data archival and after-the-fact (retrospective) queries are useful for a multitude of purposes, such as anomaly detection and network and security forensics. Data archival for such systems, however, is complicated by (a) data arrival rate, which may be hundreds of thousands of packets per second per link, and (b) the need for online indexing of this data to support retrospective queries. At these data rates, both common database index structures and general-purpose file systems perform poorly. This paper describes Hyperion, a system for archival, indexing, and on-line retrieval of high-volume data streams. We employ a write-optimized stream file system for high-speed storage of simultaneous data streams, and a novel use of signature file indexes in a distributed multi-level index. We implement Hyperion on commodity hardware and conduct a detailed evaluation using synthetic data and real network traces. Our streaming file system, StreamFS, is shown to be fast enough to archive traces at over a million packets per second. The entire system is able to archive over 200,000 packets/sec while allowing simultaneous on-line queries—queries over hours of data are shown to complete in as little as 10-20
Similar resources
Stream Traffic Data Archival, Querying, and Analysis with TransDec Final Report
Transportation assume no liability for the contents or use thereof. The contents do not necessarily reflect the official views or policies of the State of California or the Department of Transportation. This report does not constitute a standard, specification, or regulation. ABSTRACT The goal of research was to extend the traffic data analysis of the TransDec (short for Transportation Decision...
Resource-Aware Ubiquitous Data Stream Querying
This paper proposes and develops a novel, iterative model for resource-aware ubiquitous data stream querying (RA-UDSQ). Our model provides timely results to mobile users at regular time intervals specified by the user, thereby executing continuous stream queries. This model is capable of adapting to high data rates of streams and limited memory resources available on a mobile device while exec...
Fast Data Management with Distributed Streaming SQL
To stay competitive in today’s data driven economy, enterprises large and small are turning to stream processing platforms to process high volume, high velocity, and diverse streams of data (fast data) as they arrive. Low-level programming models provided by the popular systems of today suffer from lack of responsiveness to change: enhancements require code changes with attendant large turn-aro...
Parallel Implementation of Algorithms for Endmember Extraction from Aviris Hyperspectral Imagery
Hyperspectral imaging systems, used in conjunction with appropriate detection and recognition algorithms, have proven to be very useful tools in many different remote sensing applications [1]. These instruments are capable of collecting hundreds of images, corresponding to different wavelength channels, for the same area on the surface of the Earth. A chief hyperspectral sensor is the NAS...
Data Sharing in the Hyperion Peer Database System
This demo presents Hyperion, a prototype system that supports data sharing for a network of independent Peer Relational Database Management Systems (PDBMSs). The nodes of such a network are assumed to be autonomous PDBMSs that form acquaintances at run-time, and manage mapping tables to define value correspondences among different databases. They also use distributed Event-Condition-Action (ECA...
Publication date: 2007